# Audio Feature Extraction

Distilhubert Finetuned Gtzan
Apache-2.0
Audio classification model fine-tuned on the GTZAN music classification dataset based on the DistilHuBERT architecture, achieving 86% accuracy
Audio Classification Transformers
f0ghedgeh0g
39
0
Distilhubert Finetuned Gtzan
Apache-2.0
This model is an audio classification model fine-tuned on the GTZAN music classification dataset based on ntu-spml/distilhubert, achieving an accuracy of 85%.
Audio Classification Transformers
Scher314
3
0
Wav2vec2 Base BirdSet XCL
wav2vec 2.0 is a self-supervised learning framework for speech representation learning, capable of learning speech features from unlabeled audio data.
Audio Classification Transformers
DBD-research-group
177
0
Hubert Large Gender Auto
Apache-2.0
Gender classifier based on the HuBERT large model with 98.61% accuracy
Audio Classification Transformers
ittailup
13
0
Wav2vec2 Base Gender Classification
Apache-2.0
A fine-tuned voice gender classification model based on facebook/wav2vec2-base, achieving 98.92% accuracy on the evaluation set
Audio Classification Transformers
7wolf
14
1
Wav2vec2 Audio Emotion Classification
Apache-2.0
A fine-tuned audio emotion classification model based on facebook/wav2vec2-base, achieving 73.98% accuracy on the evaluation set
Audio Classification Transformers
chin-may
77
5
Distilhubert Finetuned Gtzan
Apache-2.0
This model is a fine-tuned version of NTU-SPML's DistilHuBERT on the GTZAN music classification dataset, primarily used for music genre classification tasks.
Audio Classification Transformers
Terps
15
0
Wav2vec2 Large Robust 6 Ft Age Gender
This model, fine-tuned from Wav2Vec2-Large-Robust, can predict the speaker's age and gender from raw audio.
Audio Classification Transformers
audeering
19.29k
2
Audiocourseu4 MusicClassification
Apache-2.0
A music classification model fine-tuned on the GTZAN dataset based on distilhubert, achieving 88% accuracy
Audio Classification Transformers
Imxxn
17
0
Distilhubert Finetuned Gtzan
Apache-2.0
A model fine-tuned on the GTZAN music classification dataset based on DistilHuBERT, used for music genre classification tasks
Audio Classification Transformers
artyomboyko
16
0
Distilhubert Finetuned Gtzan
Apache-2.0
This model is an audio classification model based on the DistilHuBERT architecture, fine-tuned on the GTZAN music classification dataset, primarily used for music genre classification tasks.
Audio Classification Transformers
calvpang
15
0
Distilhubert Finetuned Distilhubert
This model is a fine-tuned version of DistilHuBERT on the GTZAN music classification dataset, primarily used for music genre classification tasks.
Audio Classification Transformers
JanLilan
14
0
Distilhubert Finetuned Gtzan
Apache-2.0
A lightweight audio feature extraction model based on DistilHuBERT, fine-tuned on the GTZAN music classification dataset
Audio Classification Transformers
mory91
48
0
Distilhubert Finetuned Gtzan
Apache-2.0
This model is a fine-tuned version of DistilHuBERT on the GTZAN music classification dataset, primarily used for music genre classification tasks.
Audio Classification Transformers
Maldopast
14
0
My Awesome Model
Apache-2.0
An audio classification model based on the DistilHuBERT architecture, fine-tuned on the GTZAN music genre classification dataset with an accuracy of 94.75%
Audio Classification Transformers
AK-12
15
0
Distilhubert Finetuned Gtzan
Apache-2.0
An audio classification model based on the DistilHuBERT architecture, fine-tuned on the GTZAN music genre classification dataset
Audio Classification Transformers
technaxx
20
0
Distilhubert Finetuned Gtzan
Apache-2.0
A lightweight audio classification model fine-tuned on the GTZAN music classification dataset based on the DistilHuBERT architecture
Audio Classification Transformers
CornerINCorner
20
0
Distilhubert Finetuned Gtzan
Apache-2.0
This model is an audio classification model fine-tuned on the GTZAN music classification dataset based on DistilHuBERT, achieving an accuracy of 76.25%
Audio Classification Transformers
pratik33
14
0
Distilhubert Finetuned Gtzan
Apache-2.0
This model is a fine-tuned version of DistilHuBERT on the GTZAN music classification dataset, primarily used for music genre classification tasks.
Audio Classification Transformers
arham061
15
0
Distilhubert Finetuned Gtzan V3 Finetuned Gtzan
Apache-2.0
This model is a fine-tuned version of DistilHuBERT on the GTZAN music classification dataset, primarily used for music genre classification tasks.
Audio Classification Transformers
J3
13
0
Distilhubert Finetuned Gtzan
Apache-2.0
An audio classification model fine-tuned on the GTZAN music classification dataset based on DistilHuBERT, achieving 85% accuracy
Audio Classification Transformers
kfahn
15
0
Distilhubert Finetuned Ravdess
Apache-2.0
A speech emotion recognition model based on the DistilHuBERT architecture, fine-tuned on the RAVDESS dataset, achieving 92.36% accuracy
Audio Classification Transformers
pollner
43
2
Audio Classification Model
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base-960h; its intended uses and training data are not clearly specified.
Audio Classification Transformers
SinghManish
19
1
Distilhubert Finetuned Gtzan V2
Apache-2.0
This model is a fine-tuned version of DistilHuBERT on the GTZAN music classification dataset, primarily used for music genre classification tasks.
Audio Classification Transformers
MariaK
17
0
Ast Bird Model
Bsd-3-clause
An audio classification model fine-tuned from MIT/ast-finetuned-audioset-10-10-0.4593 on audio datasets
Audio Classification Transformers
saadashraf
22
0
MERT V1 95M
MERT-v1-95M is a music understanding model trained with the MLM paradigm, with 95M parameters, supporting a 24 kHz audio sampling rate and a 75 Hz feature rate, suitable for various music information retrieval tasks.
Audio Classification Transformers
m-a-p
83.72k
32
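The MERT entry above quotes a 24 kHz input sampling rate and a 75 Hz feature rate. As a rough illustration of what those two figures imply, the sketch below works out how many audio samples each feature frame covers and roughly how many frames a clip yields. The function names are hypothetical and not part of any library, and the frame count is an approximation: a real model may return one or two frames more or fewer because of how its front end handles clip boundaries.

```python
# Figures taken from the MERT listing entry; everything else is illustrative.
SAMPLE_RATE_HZ = 24_000   # input audio sampling rate
FEATURE_RATE_HZ = 75      # output feature (frame) rate


def samples_per_frame(sample_rate_hz: int = SAMPLE_RATE_HZ,
                      feature_rate_hz: int = FEATURE_RATE_HZ) -> int:
    """Number of raw audio samples summarized by one feature frame."""
    return sample_rate_hz // feature_rate_hz


def approx_num_frames(duration_s: float,
                      feature_rate_hz: int = FEATURE_RATE_HZ) -> int:
    """Approximate number of feature frames for a clip of the given length."""
    return int(duration_s * feature_rate_hz)


if __name__ == "__main__":
    print("samples per frame:", samples_per_frame())
    print("frames in a 5 s clip (approx.):", approx_num_frames(5.0))
```

Under these assumptions each frame spans 320 samples (about 13.3 ms of audio), which is a useful sanity check when aligning frame-level features with time-stamped music annotations.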
Wav2vec2 Base Finetuned Coscan Age Group
Apache-2.0
Age group classification model fine-tuned on the COSCAN-speech dataset based on wav2vec2-base, achieving 99.8% accuracy on the validation set
Audio Classification Transformers
versae
34
0
Wav2vec2 Base Sound2
Apache-2.0
A speech processing model fine-tuned from facebook/wav2vec2-base, achieving an accuracy of 53.57% on the evaluation set
Audio Classification Transformers
learningdude
17
0
© 2025 AIbase